1. Introduction


Discrimination has long been a controversial topic. It takes many forms, but over the last few decades one notable form has been the income gap between groups in society (Blinder, 1973). A widely discussed case is the income gap between genders. The European Commission defines the gender pay gap as “the difference between men’s and women’s pay, based on the average difference in gross hourly earnings of all employees” (European Commission, 2017). In recent years especially, addressing the gender pay gap has been a policy matter in many countries, notably those belonging to the EU (European Union, 2014).

The United Nations (2017) has set aims to alleviate gender discrimination and ensure gender equality, and the European Commission (2017) states that gender discrimination is prohibited under European law. However, the Office for National Statistics (2017) reports that female full-time employees in the UK earned on average 9.4 percent less than male full-time employees in 2016. This raises the question of whether gender pay discrimination currently exists in the UK, or whether other drivers account for the differences in salary levels.

Consequently, this report aims to identify whether pay discrimination currently exists in the United Kingdom. To examine this, we use data from the Office for National Statistics Quarterly Labour Force Survey (QLFS) for January to March 2017 (ONS, 2017). We consider factors that may affect the pay gap, such as differences in where and how people in the UK tend to work, namely nature factors (e.g. ethnicity) and nurture factors (e.g. occupation, region of work). Following investigation of the variables and linear regressions, we hope to make inferences that would aid the creation of policy aimed at minimising the pay gap.

2. Theory

Our null hypothesis is that none of the predictors has a significant effect on salary. The alternative hypothesis is that at least one predictor has a significant effect on salary.

We are interested in whether this apparent income gap results from actual gender discrimination, which would be in accordance with our sources, or whether it arises from the influence of other variables. We therefore investigate it in a linear regression model with the relevant variables.

To that end, we want to determine other possible causes of the income gap, and whether working in different industry sectors or holding different educational qualifications affects salary when these are included in a linear regression model with salary as the dependent variable.

In line with our sources, our theory is that gender discrimination is still present in today’s society. We therefore expect gender to have an effect on salary.
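
The hypotheses above can be written against a linear regression model of the kind fitted later in the report. The following is a sketch in our own notation: the symbols $\beta_j$, $x_{ij}$ and $\varepsilon_i$ are assumptions introduced here for illustration, with $x_{i1}$ taken to be an indicator for a female respondent and the remaining $x_{ij}$ the dummy variables for the other predictors.

```latex
\begin{align}
\text{Salary}_i &= \beta_0 + \beta_1 x_{i1} + \sum_{j=2}^{k} \beta_j x_{ij} + \varepsilon_i \\
H_0 &: \beta_1 = \beta_2 = \dots = \beta_k = 0 \\
H_1 &: \beta_j \neq 0 \text{ for at least one } j \in \{1, \dots, k\}
\end{align}
```

Under this formulation, the coefficient of particular interest is $\beta_1$: it measures the expected salary difference between women and men once the other predictors are held fixed.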

3. Methods

Our analysis of the dataset uses several statistical tools in R. The ggplot2 and knitr packages were used to visualise and tabulate the data, dplyr was used for data manipulation, and lmtest was used for further statistical testing.

3.1 Data Description


The dataset initially consisted of 88,528 observations of 739 variables, covering 88,528 individuals from the UK. The QLFS aims to ‘inform social, economic and employment policy’ (ONS, 2017b). As such, the questionnaire covers individual demographics, household characteristics and job information, amongst others. In order to examine the relationship between salary and gender and explore the possibility of gender discrimination in the UK, we selected variables which may also affect the pay gap. Our final dataset contains 570 observations of 7 variables. These are outlined below.

Variable summary

| Variable | Description | Further info |
|:---|:---|:---|
| Salary | Salary of respondent (249.5 - 48000.0) | N.b. adjusted for inflation |
| Age | Age of respondent (0-99) | |
| Sex | Gender of respondent | |
| Occupation | Major occupation group of respondent | |
| Industry | Industry sector in main job | 1 = Distribution, hotels, restaurants; 2 = Banking, Finance |
| Education | Highest qualification level | |
| NumEmployee | Number of employees at workplace | |
| Religion | Religion GB level | |
| MaritalStatus | Marital status | |
| Ethnicity | Ethnicity of respondent in GB | |
| Region of workplace | Region of place of work | |

For ease of analysis we manipulated certain variables. Salary was converted from intervals to continuous data, to allow for a linear regression with Salary as the dependent variable, and all other variables were each condensed into a fixed number of categories, coded as dummy variables.

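
The report does not show how the salary intervals were made continuous or how the categories were condensed. One common approach, assumed here, is to replace each band with its midpoint and to merge factor levels before the regression. The band edges, the raw industry labels and the object names below are hypothetical and chosen only for illustration.

```r
# Hypothetical salary bands; the real QLFS band edges are not shown in this report.
bands <- data.frame(
  lower = c(0, 10000, 18000),
  upper = c(499, 10999, 18999)
)

# Represent each band by its midpoint so that Salary can be treated as continuous.
bands$midpoint <- (bands$lower + bands$upper) / 2

# Collapse a detailed categorical variable into fewer levels (raw labels are made up).
industry_raw <- c("Retail", "Hotels", "Banking", "Insurance", "Mining")
industry <- factor(industry_raw)
levels(industry) <- list(
  "Distribution, hotels, restaurants" = c("Retail", "Hotels"),
  "Banking, Finance" = c("Banking", "Insurance"),
  "Other" = "Mining"
)

# lm() expands a factor into dummy (indicator) columns; model.matrix() shows them.
# The first level acts as the reference category and gets no column of its own.
dummies <- model.matrix(~ industry)

bands
colnames(dummies)
```

With this coding, each regression coefficient for a factor level is read as a difference from the reference level rather than as an absolute salary.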
summary(facQlfs)
##       Age           Sex                                   Industry  
##  Min.   :16.0   Male  :262   Distribution, hotels, restaurants:111  
##  1st Qu.:30.0   Female:281   Banking, Finance                 : 82  
##  Median :43.0                Public admin, education, health  :203  
##  Mean   :42.1                Other                            :147  
##  3rd Qu.:53.0                                                       
##  Max.   :77.0                                                       
##                                                                     
##             Education      NumEmployee 
##  Degree          :164   Under 50 :268  
##  Higher Education: 51   Under 500:183  
##  A Level         :115   Over 500 : 92  
##  GCSE A*-C       :127                  
##  Other           : 52                  
##  No              : 34                  
##                                        
##                                  Occupation      Salary       
##  Professional                         :105   Min.   :  249.5  
##  Caring, Leisure, Other Service       : 75   1st Qu.:10499.5  
##  Associate Professional, Technical    : 66   Median :18499.5  
##  Administrative, Secretarial          : 61   Mean   :20860.0  
##  Managers, Directors, Senior Officials: 55   3rd Qu.:30499.5  
##  Sales, Customer Service              : 53   Max.   :48000.0  
##  (Other)                              :128                    
##     Marital    Religion  Ethnicity                      workRegion 
##  Single :213   No :208   White:486   South East              :121  
##  Married:268   Yes:335   Asian: 21   Yorkshire and the Humber: 96  
##  Other  : 62             Black: 15   London                  : 91  
##                          Other: 21   South West              : 67  
##                                      North East              : 54  
##                                      West Midlands           : 47  
##                                      (Other)                 : 67
Industry summary

| Industry Type | #Individuals |
|:---|:---|
| Distribution, hotels, restaurants | 111 |
| Banking, Finance | 82 |
| Public admin, education, health | 203 |
| Other | 147 |

Education summary

| Education Type | #Individuals |
|:---|:---|
| Degree | 164 |
| Higher Education | 51 |
| A Level | 115 |
| GCSE A*-C | 127 |
| Other | 52 |
| No | 34 |

Number of Employees Summary

| Number of Employees | #Individuals |
|:---|:---|
| Under 50 | 268 |
| Under 500 | 183 |
| Over 500 | 92 |

Occupation Summary

| Occupation type | #Individuals |
|:---|:---|
| Managers, Directors, Senior Officials | 55 |
| Professional | 105 |
| Associate Professional, Technical | 66 |
| Administrative, Secretarial | 61 |
| Skilled Trades | 47 |
| Caring, Leisure, Other Service | 75 |
| Sales, Customer Service | 53 |
| Process, Plant, Machine Operatives | 31 |
| Elementary | 50 |

Marital Status Summary

| Marital status | #Individuals |
|:---|:---|
| Single | 213 |
| Married | 268 |
| Other | 62 |

Religion Summary

| Religion Type | #Individuals |
|:---|:---|
| No | 208 |
| Yes | 335 |

Ethnicity Summary

| Ethnicity Type | #Individuals |
|:---|:---|
| White | 486 |
| Asian | 21 |
| Black | 15 |
| Other | 21 |
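
Work regions Summary

| Regions of Workplace | #Individuals |
|:---|:---|
| South East | 121 |
| Yorkshire and the Humber | 96 |
| London | 91 |
| South West | 67 |
| North East | 54 |
| West Midlands | 47 |
| (Other) | 67 |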

3.2 Strengths and limitations

4. Analysis

4.1 Descriptive statistics

4.1.1 Univariate descriptive of variables

  • central tendency (mean, mode, median)
  • dispersion (range, variance, maximum, minimum, quartiles & IQR, std.)
  • frequency distribution tables
  • bar charts
  • histograms
Sex
# Mean gross weekly pay (main job) by age band, entered by hand
man <- c(415, 532, 686, 773, 764, 757, 749, 656, 621)
woman <- c(366, 492, 573, 613, 568, 557, 529, 518, 405)
ages <- c("21-25", "26-30", "31-35", "36-40", "41-45", "46-50", "51-55", "56-60", "60-65")
dtmw <- data.frame(ages, man, woman)

# One line and one set of points per sex; group is fixed because x is categorical
ggplot(dtmw, aes(x = ages)) +
  geom_line(aes(y = man, colour = 'male'), group = 1) +
  geom_point(aes(y = man, colour = 'male')) +
  geom_line(aes(y = woman, colour = 'female'), group = 2) +
  geom_point(aes(y = woman, colour = 'female')) +
  ylim(0, 800) +
  scale_colour_discrete(name = 'Sex') +
  labs(title = 'Mean gross weekly pay in main job', x = 'Age', y = 'Salary') +
  theme(plot.title = element_text(hjust = 0.5))

sextable <- facQlfs %>%
  select(Sex, Salary) %>%
  group_by(Sex) %>%
  summarise(Mean = mean(Salary), 'Standard Deviation' = sd(Salary))
kable(sextable, align = 'l')

| Sex | Mean | Standard Deviation |
|:---|:---|:---|
| Male | 25812.52 | 12614.71 |
| Female | 16242.32 | 10833.12 |

Industry

industrytable <- facQlfs %>%
  select(Industry, Salary) %>%
  group_by(Industry) %>%
  summarise(Mean = mean(Salary), 'Standard Deviation' = sd(Salary))
kable(industrytable, align = 'l')

| Industry | Mean | Standard Deviation |
|:---|:---|:---|
| Distribution, hotels, restaurants | 13894.55 | 9311.661 |
| Banking, Finance | 25477.55 | 14122.353 |
| Public admin, education, health | 18955.17 | 11926.377 |
| Other | 26174.29 | 11810.261 |

Highest Education Level

# Education mix within each occupation group
ggplot(facQlfs, aes(x = Occupation, fill = Education)) +
  geom_bar(position = position_dodge()) +
  labs(title = 'Highest Education Qualification distribution by Occupation', x = 'Occupation', y = 'Count') +
  theme(plot.title = element_text(hjust = 0.5), axis.text = element_text(angle = 50, hjust = 0.9, size = 6))

# Occupation mix within each education level
ggplot(facQlfs, aes(x = Education, fill = Occupation)) +
  geom_bar(position = position_dodge()) +
  labs(title = 'Occupation distribution among Highest Education Qualification', x = 'Education', y = 'Count') +
  theme(plot.title = element_text(hjust = 0.5), axis.text = element_text(angle = 45, hjust = 0.9))

## # A tibble: 6 x 2
##          Education `mean(Salary)`
##             <fctr>          <dbl>
## 1           Degree       27478.49
## 2 Higher Education       20436.90
## 3          A Level       19052.11
## 4        GCSE A*-C       17511.61
## 5            Other       15926.26
## 6               No       15737.79
## # A tibble: 9 x 2
##                              Occupation    percent
##                                  <fctr>      <dbl>
## 1 Managers, Directors, Senior Officials 14.4508671
## 2                          Professional 45.6647399
## 3     Associate Professional, Technical 13.2947977
## 4           Administrative, Secretarial  8.0924855
## 5                        Skilled Trades  0.5780347
## 6        Caring, Leisure, Other Service  5.7803468
## 7               Sales, Customer Service  4.0462428
## 8    Process, Plant, Machine Operatives  1.1560694
## 9                            Elementary  1.7341040
eductable <- facQlfs %>%
  select(Education, Salary) %>%
  group_by(Education) %>%
  summarise(Mean = mean(Salary), 'Standard Deviation' = sd(Salary))
kable(eductable, digits = 0, align = c('l', 'c', 'r'), caption = 'Salary Mean and Standard Deviation by Education')

Salary Mean and Standard Deviation by Education

| Education | Mean | Standard Deviation |
|:---|:---:|---:|
| Degree | 27478 | 13489 |
| Higher Education | 20437 | 12662 |
| A Level | 19052 | 11375 |
| GCSE A*-C | 17512 | 11227 |
| Other | 15926 | 9723 |
| No | 15738 | 8761 |

Employee Numbers in workplace

numemploytable <- facQlfs %>%
  select(NumEmployee, Salary) %>%
  group_by(NumEmployee) %>%
  summarise(Mean = mean(Salary), 'Standard Deviation' = sd(Salary))
kable(numemploytable, digits = 0, align = c('l', 'l', 'r'), caption = 'Salary Mean and Standard Deviation by # of Employees')

Salary Mean and Standard Deviation by # of Employees

| NumEmployee | Mean | Standard Deviation |
|:---|:---|---:|
| Under 50 | 18305 | 11951 |
| Under 500 | 20966 | 12002 |
| Over 500 | 28092 | 13217 |

Occupations

occuptable <- facQlfs %>%
  select(Occupation, Salary) %>%
  group_by(Occupation) %>%
  summarise(Mean = mean(Salary), 'Standard Deviation' = sd(Salary))
kable(occuptable, digits = 0, align = 'l', caption = 'Salary Mean and Standard Deviation by Occupation')

Salary Mean and Standard Deviation by Occupation

| Occupation | Mean | Standard Deviation |
|:---|:---|:---|
| Managers, Directors, Senior Officials | 30682 | 11546 |
| Professional | 29018 | 12956 |
| Associate Professional, Technical | 25735 | 12872 |
| Administrative, Secretarial | 17732 | 10511 |
| Skilled Trades | 22152 | 10931 |
| Caring, Leisure, Other Service | 13442 | 8217 |
| Sales, Customer Service | 11954 | 6063 |
| Process, Plant, Machine Operatives | 18488 | 8791 |
| Elementary | 11130 | 6786 |

Religion

Ethnicity

Region of Workplace

4.1.2 Bivariate descriptive

Variable pairs to describe: occupation and industry; occupation and education.

4.2 Correlation analyses

Variable pairs examined: occupation and education; occupation and industry.

## 
##  Pearson's product-moment correlation
## 
## data:  qlfs$Education and qlfs$Occupation
## t = 14.009, df = 541, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4513880 0.5751126
## sample estimates:
##       cor 
## 0.5159359
## 
##  Pearson's product-moment correlation
## 
## data:  qlfs$Industry and qlfs$Occupation
## t = -5.2834, df = 541, p-value = 1.841e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.3000600 -0.1399736
## sample estimates:
##        cor 
## -0.2215087
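
The output above has the form produced by `cor.test()` applied to the numeric codes of two categorical variables. A minimal sketch of such a call on made-up codes follows; the vectors `education` and `occupation` are toy stand-ins, not the QLFS columns. Because Pearson's correlation treats the codes as evenly spaced numbers, a rank-based alternative (`method = "spearman"`) is shown alongside it as a suggestion of ours, not something the report ran.

```r
# Toy integer codes standing in for qlfs$Education and qlfs$Occupation (made-up data).
set.seed(1)
education <- sample(1:6, 200, replace = TRUE)
occupation <- pmin(9, pmax(1, education + sample(-2:3, 200, replace = TRUE)))

# Pearson correlation, the test shown in the output above.
# It treats the category codes as interval-scaled numbers.
pearson <- cor.test(education, occupation)

# Rank-based alternative that only uses the ordering of the codes.
# exact = FALSE avoids the warning about exact p-values with ties.
spearman <- cor.test(education, occupation, method = "spearman", exact = FALSE)

pearson$estimate
spearman$estimate
```

Either coefficient is only meaningful if the category codes have a sensible order; for unordered categories such as industry, a test on the contingency table is usually the safer choice.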

4.3 Hypothesis tests

4.3.1 T-tests

## 
##  Welch Two Sample t-test
## 
## data:  caring$Salary and sales$Salary
## t = 1.1784, df = 125.74, p-value = 0.2409
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1010.806  3986.058
## sample estimates:
## mean of x mean of y 
##  13441.95  11954.32
## 
##  Welch Two Sample t-test
## 
## data:  caring$Salary and elem$Salary
## t = 1.7133, df = 117.36, p-value = 0.08929
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -360.4099 4984.9032
## sample estimates:
## mean of x mean of y 
##  13441.95  11129.70
## 
##  Welch Two Sample t-test
## 
## data:  sales$Salary and elem$Salary
## t = 0.64894, df = 98.141, p-value = 0.5179
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1697.030  3346.272
## sample estimates:
## mean of x mean of y 
##  11954.32  11129.70
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  facQlfs$Salary and facQlfs$Education 
## 
##                  Degree  Higher Education A Level GCSE A*-C Other 
## Higher Education 0.0026  -                -       -         -     
## A Level          1.2e-07 1.0000           -       -         -     
## GCSE A*-C        5.8e-11 0.9639           1.0000  -         -     
## Other            2.6e-08 0.5435           0.9252  1.0000    -     
## No               2.6e-06 0.6690           0.9639  1.0000    1.0000
## 
## P value adjustment method: holm
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  facQlfs$Salary and facQlfs$Sex 
## 
##        Male  
## Female <2e-16
## 
## P value adjustment method: holm
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  facQlfs$Salary and facQlfs$Industry 
## 
##                                 Distribution, hotels, restaurants
## Banking, Finance                1.9e-10                          
## Public admin, education, health 0.0006                           
## Other                           5.4e-15                          
##                                 Banking, Finance
## Banking, Finance                -               
## Public admin, education, health 8.2e-05         
## Other                           0.6680          
##                                 Public admin, education, health
## Banking, Finance                -                              
## Public admin, education, health -                              
## Other                           9.9e-08                        
## 
## P value adjustment method: holm
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  facQlfs$Salary and facQlfs$NumEmployee 
## 
##           Under 50 Under 500
## Under 500 0.023    -        
## Over 500  2.2e-10  1.2e-05  
## 
## P value adjustment method: holm
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  facQlfs$Salary and facQlfs$Occupation 
## 
##                                    Managers, Directors, Senior Officials
## Professional                       1.00000                              
## Associate Professional, Technical  0.12280                              
## Administrative, Secretarial        2.2e-09                              
## Skilled Trades                     0.00097                              
## Caring, Leisure, Other Service     < 2e-16                              
## Sales, Customer Service            < 2e-16                              
## Process, Plant, Machine Operatives 8.6e-06                              
## Elementary                         < 2e-16                              
##                                    Professional
## Professional                       -           
## Associate Professional, Technical  0.37886     
## Administrative, Secretarial        1.8e-09     
## Skilled Trades                     0.00395     
## Caring, Leisure, Other Service     < 2e-16     
## Sales, Customer Service            < 2e-16     
## Process, Plant, Machine Operatives 2.9e-05     
## Elementary                         < 2e-16     
##                                    Associate Professional, Technical
## Professional                       -                                
## Associate Professional, Technical  -                                
## Administrative, Secretarial        0.00043                          
## Skilled Trades                     0.52338                          
## Caring, Leisure, Other Service     3.5e-10                          
## Sales, Customer Service            1.1e-10                          
## Process, Plant, Machine Operatives 0.02619                          
## Elementary                         1.5e-11                          
##                                    Administrative, Secretarial
## Professional                       -                          
## Associate Professional, Technical  -                          
## Administrative, Secretarial        -                          
## Skilled Trades                     0.27701                    
## Caring, Leisure, Other Service     0.20136                    
## Sales, Customer Service            0.05004                    
## Process, Plant, Machine Operatives 1.00000                    
## Elementary                         0.01805                    
##                                    Skilled Trades
## Professional                       -             
## Associate Professional, Technical  -             
## Administrative, Secretarial        -             
## Skilled Trades                     -             
## Caring, Leisure, Other Service     0.00022       
## Sales, Customer Service            3.7e-05       
## Process, Plant, Machine Operatives 0.79574       
## Elementary                         8.6e-06       
##                                    Caring, Leisure, Other Service
## Professional                       -                             
## Associate Professional, Technical  -                             
## Administrative, Secretarial        -                             
## Skilled Trades                     -                             
## Caring, Leisure, Other Service     -                             
## Sales, Customer Service            1.00000                       
## Process, Plant, Machine Operatives 0.25016                       
## Elementary                         1.00000                       
##                                    Sales, Customer Service
## Professional                       -                      
## Associate Professional, Technical  -                      
## Administrative, Secretarial        -                      
## Skilled Trades                     -                      
## Caring, Leisure, Other Service     -                      
## Sales, Customer Service            -                      
## Process, Plant, Machine Operatives 0.08056                
## Elementary                         1.00000                
##                                    Process, Plant, Machine Operatives
## Professional                       -                                 
## Associate Professional, Technical  -                                 
## Administrative, Secretarial        -                                 
## Skilled Trades                     -                                 
## Caring, Leisure, Other Service     -                                 
## Sales, Customer Service            -                                 
## Process, Plant, Machine Operatives -                                 
## Elementary                         0.03473                           
## 
## P value adjustment method: holm
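
The two kinds of output above correspond to `t.test()` (Welch two-sample tests) and `pairwise.t.test()` (pooled SD with Holm adjustment). The sketch below shows both calls on made-up salaries; the data frame `toy` and its group labels are hypothetical and are not the QLFS sample.

```r
# Made-up salaries for three hypothetical groups (not the QLFS data).
set.seed(42)
toy <- data.frame(
  salary = c(rnorm(40, 13000, 8000), rnorm(40, 12000, 6000), rnorm(40, 25000, 12000)),
  group = factor(rep(c("Caring", "Sales", "Professional"), each = 40))
)

# Welch two-sample t-test: R's default (var.equal = FALSE) does not assume
# equal variances in the two groups.
caring <- subset(toy, group == "Caring")
sales <- subset(toy, group == "Sales")
welch <- t.test(caring$salary, sales$salary)

# All pairwise comparisons using a pooled SD, with Holm-adjusted p-values
# to control for the number of comparisons.
pairwise <- pairwise.t.test(toy$salary, toy$group, p.adjust.method = "holm")

welch$p.value
pairwise$p.value
```

The Holm adjustment matters most for variables with many levels, such as Occupation, where 36 pairwise comparisons are made at once.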

4.3.2 Chi-square tests

## Warning in chisq.test(table(facQlfs$Industry, facQlfs$Occupation)): Chi-
## squared approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  table(facQlfs$Industry, facQlfs$Occupation)
## X-squared = 352.87, df = 24, p-value < 2.2e-16
## Warning in chisq.test(table(facQlfs$Education, facQlfs$Occupation)): Chi-
## squared approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  table(facQlfs$Education, facQlfs$Occupation)
## X-squared = 252.74, df = 40, p-value < 2.2e-16
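
Both tests above carry the warning that the chi-squared approximation may be incorrect, which R issues when some expected cell counts are below 5. The sketch below, on made-up categories, shows how the expected counts can be inspected and how a simulated p-value can be requested instead; the variables and labels are hypothetical, and the simulation option is a suggestion of ours rather than part of the original analysis.

```r
# Made-up categorical data (not the QLFS sample).
set.seed(7)
industry <- factor(sample(c("Banking", "Retail", "Public"), 300, replace = TRUE))
occupation <- factor(sample(c("Manager", "Sales", "Elementary", "Professional"),
                            300, replace = TRUE))
tab <- table(industry, occupation)

# Pearson chi-squared test of independence between the two factors.
chi <- chisq.test(tab)

# Expected counts under independence; cells below 5 trigger the
# approximation warning seen in the output above.
chi$expected

# Monte Carlo p-value, which does not rely on the chi-squared approximation.
chi_sim <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)

chi$p.value
chi_sim$p.value
```

Another option, when sparse cells are the cause, is to merge small categories before testing.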

4.4 Linear regression

## 
## Call:
## lm(formula = Salary ~ Education + Sex + Industry + NumEmployee + 
##     Occupation + Marital + Religion + Ethnicity + workRegion, 
##     data = facQlfs)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -26967.0  -5948.4    505.5   5967.2  27261.8 
## 
## Coefficients:
##                                              Estimate Std. Error t value
## (Intercept)                                   29672.0     2358.8  12.579
## EducationHigher Education                     -4026.8     1545.4  -2.606
## EducationA Level                              -4842.3     1326.3  -3.651
## EducationGCSE A*-C                            -4643.4     1304.3  -3.560
## EducationOther                                -4196.8     1732.4  -2.422
## EducationNo                                   -4822.0     1955.0  -2.467
## SexFemale                                     -6733.5      950.5  -7.084
## IndustryBanking, Finance                       3323.9     1599.3   2.078
## IndustryPublic admin, education, health        -304.0     1443.3  -0.211
## IndustryOther                                  5704.1     1408.0   4.051
## NumEmployeeUnder 500                            852.8      930.4   0.917
## NumEmployeeOver 500                            5710.0     1213.1   4.707
## OccupationProfessional                        -1061.2     1657.1  -0.640
## OccupationAssociate Professional, Technical   -2819.2     1756.7  -1.605
## OccupationAdministrative, Secretarial         -8265.5     1820.3  -4.541
## OccupationSkilled Trades                      -8147.8     1961.9  -4.153
## OccupationCaring, Leisure, Other Service      -8917.5     1893.5  -4.709
## OccupationSales, Customer Service            -11633.3     2068.8  -5.623
## OccupationProcess, Plant, Machine Operatives -12386.7     2232.6  -5.548
## OccupationElementary                         -14151.0     2031.1  -6.967
## MaritalMarried                                 1826.5      929.8   1.964
## MaritalOther                                   2238.0     1385.1   1.616
## ReligionYes                                   -1139.6      860.2  -1.325
## EthnicityAsian                                -3553.5     2111.1  -1.683
## EthnicityBlack                                -1550.0     2522.3  -0.614
## EthnicityOther                                 -285.9     2141.3  -0.134
## workRegionYorkshire and the Humber             1826.7     1614.9   1.131
## workRegionEast Midlands                        1476.9     1930.2   0.765
## workRegionWest Midlands                       -3400.1     1879.2  -1.809
## workRegionEast of England                      2152.1     2354.0   0.914
## workRegionLondon                               2047.1     1635.5   1.252
## workRegionSouth West                           1021.6     1713.9   0.596
## workRegionSouth East                           1133.0     1544.5   0.734
## workRegionOutside UK                           9102.3     9549.4   0.953
##                                              Pr(>|t|)    
## (Intercept)                                   < 2e-16 ***
## EducationHigher Education                    0.009435 ** 
## EducationA Level                             0.000288 ***
## EducationGCSE A*-C                           0.000406 ***
## EducationOther                               0.015762 *  
## EducationNo                                  0.013970 *  
## SexFemale                                    4.69e-12 ***
## IndustryBanking, Finance                     0.038183 *  
## IndustryPublic admin, education, health      0.833252    
## IndustryOther                                5.89e-05 ***
## NumEmployeeUnder 500                         0.359789    
## NumEmployeeOver 500                          3.24e-06 ***
## OccupationProfessional                       0.522218    
## OccupationAssociate Professional, Technical  0.109162    
## OccupationAdministrative, Secretarial        7.01e-06 ***
## OccupationSkilled Trades                     3.85e-05 ***
## OccupationCaring, Leisure, Other Service     3.21e-06 ***
## OccupationSales, Customer Service            3.10e-08 ***
## OccupationProcess, Plant, Machine Operatives 4.65e-08 ***
## OccupationElementary                         1.00e-11 ***
## MaritalMarried                               0.050024 .  
## MaritalOther                                 0.106753    
## ReligionYes                                  0.185834    
## EthnicityAsian                               0.092949 .  
## EthnicityBlack                               0.539165    
## EthnicityOther                               0.893841    
## workRegionYorkshire and the Humber           0.258528    
## workRegionEast Midlands                      0.444550    
## workRegionWest Midlands                      0.070979 .  
## workRegionEast of England                    0.361025    
## workRegionLondon                             0.211267    
## workRegionSouth West                         0.551404    
## workRegionSouth East                         0.463538    
## workRegionOutside UK                         0.340953    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9206 on 509 degrees of freedom
## Multiple R-squared:  0.5031, Adjusted R-squared:  0.4709 
## F-statistic: 15.62 on 33 and 509 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = Salary ~ Education + Sex + Industry + NumEmployee + 
##     Occupation, data = facQlfs)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -28168.4  -5330.9    716.9   5810.2  28070.7 
## 
## Coefficients:
##                                               Estimate Std. Error t value
## (Intercept)                                   30965.20    1838.43  16.843
## EducationHigher Education                     -3962.11    1553.36  -2.551
## EducationA Level                              -4994.07    1306.26  -3.823
## EducationGCSE A*-C                            -4324.68    1288.09  -3.357
## EducationOther                                -3379.98    1701.91  -1.986
## EducationNo                                   -4100.46    1953.10  -2.099
## SexFemale                                     -7073.80     932.32  -7.587
## IndustryBanking, Finance                       3754.20    1589.81   2.361
## IndustryPublic admin, education, health         -36.14    1442.07  -0.025
## IndustryOther                                  5421.75    1406.48   3.855
## NumEmployeeUnder 500                            897.30     924.66   0.970
## NumEmployeeOver 500                            6050.89    1187.47   5.096
## OccupationProfessional                        -1197.35    1647.72  -0.727
## OccupationAssociate Professional, Technical   -2698.83    1739.08  -1.552
## OccupationAdministrative, Secretarial         -8411.84    1812.10  -4.642
## OccupationSkilled Trades                      -8381.08    1950.88  -4.296
## OccupationCaring, Leisure, Other Service      -9424.12    1849.28  -5.096
## OccupationSales, Customer Service            -11781.38    2042.65  -5.768
## OccupationProcess, Plant, Machine Operatives -12550.62    2227.49  -5.634
## OccupationElementary                         -14916.41    1974.74  -7.554
##                                              Pr(>|t|)    
## (Intercept)                                   < 2e-16 ***
## EducationHigher Education                    0.011035 *  
## EducationA Level                             0.000148 ***
## EducationGCSE A*-C                           0.000844 ***
## EducationOther                               0.047555 *  
## EducationNo                                  0.036255 *  
## SexFemale                                    1.51e-13 ***
## IndustryBanking, Finance                     0.018571 *  
## IndustryPublic admin, education, health      0.980016    
## IndustryOther                                0.000130 ***
## NumEmployeeUnder 500                         0.332295    
## NumEmployeeOver 500                          4.86e-07 ***
## OccupationProfessional                       0.467755    
## OccupationAssociate Professional, Technical  0.121297    
## OccupationAdministrative, Secretarial        4.37e-06 ***
## OccupationSkilled Trades                     2.07e-05 ***
## OccupationCaring, Leisure, Other Service     4.85e-07 ***
## OccupationSales, Customer Service            1.38e-08 ***
## OccupationProcess, Plant, Machine Operatives 2.87e-08 ***
## OccupationElementary                         1.90e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9295 on 523 degrees of freedom
## Multiple R-squared:  0.4795, Adjusted R-squared:  0.4606 
## F-statistic: 25.35 on 19 and 523 DF,  p-value: < 2.2e-16
stargazer(list(facQlfs.lm1, facQlfs.lm2), digits = 2)
## 
## % Table created by stargazer v.5.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu
## % Date and time: Thu, Oct 12, 2017 - 13:57:53
## \begin{table}[!htbp] \centering 
##   \caption{} 
##   \label{} 
## \begin{tabular}{@{\extracolsep{5pt}}lcc} 
## \\[-1.8ex]\hline 
## \hline \\[-1.8ex] 
##  & \multicolumn{2}{c}{\textit{Dependent variable:}} \\ 
## \cline{2-3} 
## \\[-1.8ex] & \multicolumn{2}{c}{Salary} \\ 
## \\[-1.8ex] & (1) & (2)\\ 
## \hline \\[-1.8ex] 
##  EducationHigher Education & $-$4,026.84$^{***}$ & $-$3,962.11$^{**}$ \\ 
##   & (1,545.35) & (1,553.36) \\ 
##   & & \\ 
##  EducationA Level & $-$4,842.31$^{***}$ & $-$4,994.07$^{***}$ \\ 
##   & (1,326.33) & (1,306.26) \\ 
##   & & \\ 
##  EducationGCSE A\textasteriskcentered -C & $-$4,643.45$^{***}$ & $-$4,324.68$^{***}$ \\ 
##   & (1,304.35) & (1,288.09) \\ 
##   & & \\ 
##  EducationOther & $-$4,196.82$^{**}$ & $-$3,379.98$^{**}$ \\ 
##   & (1,732.44) & (1,701.91) \\ 
##   & & \\ 
##  EducationNo & $-$4,822.00$^{**}$ & $-$4,100.46$^{**}$ \\ 
##   & (1,954.95) & (1,953.10) \\ 
##   & & \\ 
##  SexFemale & $-$6,733.45$^{***}$ & $-$7,073.80$^{***}$ \\ 
##   & (950.54) & (932.32) \\ 
##   & & \\ 
##  IndustryBanking, Finance & 3,323.91$^{**}$ & 3,754.20$^{**}$ \\ 
##   & (1,599.34) & (1,589.81) \\ 
##   & & \\ 
##  IndustryPublic admin, education, health & $-$304.01 & $-$36.14 \\ 
##   & (1,443.26) & (1,442.07) \\ 
##   & & \\ 
##  IndustryOther & 5,704.14$^{***}$ & 5,421.75$^{***}$ \\ 
##   & (1,407.96) & (1,406.48) \\ 
##   & & \\ 
##  NumEmployeeUnder 500 & 852.78 & 897.30 \\ 
##   & (930.38) & (924.66) \\ 
##   & & \\ 
##  NumEmployeeOver 500 & 5,710.04$^{***}$ & 6,050.89$^{***}$ \\ 
##   & (1,213.08) & (1,187.47) \\ 
##   & & \\ 
##  OccupationProfessional & $-$1,061.17 & $-$1,197.35 \\ 
##   & (1,657.11) & (1,647.72) \\ 
##   & & \\ 
##  OccupationAssociate Professional, Technical & $-$2,819.16 & $-$2,698.83 \\ 
##   & (1,756.73) & (1,739.08) \\ 
##   & & \\ 
##  OccupationAdministrative, Secretarial & $-$8,265.49$^{***}$ & $-$8,411.84$^{***}$ \\ 
##   & (1,820.32) & (1,812.10) \\ 
##   & & \\ 
##  OccupationSkilled Trades & $-$8,147.81$^{***}$ & $-$8,381.08$^{***}$ \\ 
##   & (1,961.93) & (1,950.88) \\ 
##   & & \\ 
##  OccupationCaring, Leisure, Other Service & $-$8,917.46$^{***}$ & $-$9,424.12$^{***}$ \\ 
##   & (1,893.52) & (1,849.28) \\ 
##   & & \\ 
##  OccupationSales, Customer Service & $-$11,633.34$^{***}$ & $-$11,781.38$^{***}$ \\ 
##   & (2,068.85) & (2,042.65) \\ 
##   & & \\ 
##  OccupationProcess, Plant, Machine Operatives & $-$12,386.71$^{***}$ & $-$12,550.62$^{***}$ \\ 
##   & (2,232.65) & (2,227.49) \\ 
##   & & \\ 
##  OccupationElementary & $-$14,151.02$^{***}$ & $-$14,916.41$^{***}$ \\ 
##   & (2,031.08) & (1,974.74) \\ 
##   & & \\ 
##  MaritalMarried & 1,826.49$^{*}$ &  \\ 
##   & (929.78) &  \\ 
##   & & \\ 
##  MaritalOther & 2,238.00 &  \\ 
##   & (1,385.06) &  \\ 
##   & & \\ 
##  ReligionYes & $-$1,139.62 &  \\ 
##   & (860.22) &  \\ 
##   & & \\ 
##  EthnicityAsian & $-$3,553.47$^{*}$ &  \\ 
##   & (2,111.13) &  \\ 
##   & & \\ 
##  EthnicityBlack & $-$1,549.96 &  \\ 
##   & (2,522.35) &  \\ 
##   & & \\ 
##  EthnicityOther & $-$285.90 &  \\ 
##   & (2,141.35) &  \\ 
##   & & \\ 
##  workRegionYorkshire and the Humber & 1,826.66 &  \\ 
##   & (1,614.88) &  \\ 
##   & & \\ 
##  workRegionEast Midlands & 1,476.86 &  \\ 
##   & (1,930.21) &  \\ 
##   & & \\ 
##  workRegionWest Midlands & $-$3,400.14$^{*}$ &  \\ 
##   & (1,879.16) &  \\ 
##   & & \\ 
##  workRegionEast of England & 2,152.12 &  \\ 
##   & (2,354.01) &  \\ 
##   & & \\ 
##  workRegionLondon & 2,047.13 &  \\ 
##   & (1,635.53) &  \\ 
##   & & \\ 
##  workRegionSouth West & 1,021.60 &  \\ 
##   & (1,713.94) &  \\ 
##   & & \\ 
##  workRegionSouth East & 1,132.99 &  \\ 
##   & (1,544.45) &  \\ 
##   & & \\ 
##  workRegionOutside UK & 9,102.29 &  \\ 
##   & (9,549.41) &  \\ 
##   & & \\ 
##  Constant & 29,672.04$^{***}$ & 30,965.20$^{***}$ \\ 
##   & (2,358.79) & (1,838.43) \\ 
##   & & \\ 
## \hline \\[-1.8ex] 
## Observations & 543 & 543 \\ 
## R$^{2}$ & 0.50 & 0.48 \\ 
## Adjusted R$^{2}$ & 0.47 & 0.46 \\ 
## Residual Std. Error & 9,205.64 (df = 509) & 9,295.25 (df = 523) \\ 
## F Statistic & 15.62$^{***}$ (df = 33; 509) & 25.35$^{***}$ (df = 19; 523) \\ 
## \hline 
## \hline \\[-1.8ex] 
## \textit{Note:}  & \multicolumn{2}{r}{$^{*}$p$<$0.1; $^{**}$p$<$0.05; $^{***}$p$<$0.01} \\ 
## \end{tabular} 
## \end{table}

The null hypothesis for the ANOVA is that the mean of the dependent variable is the same across all groups.
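As a rough illustration of that null hypothesis, the sketch below runs a one-way ANOVA in base R. The data are simulated and the group labels, sample sizes, and salary figures are hypothetical; none of them come from the sample analysed here.

```r
# Simulated data only: group labels and salary figures are hypothetical.
set.seed(1)
group <- factor(rep(c("Group A", "Group B", "Group C"), each = 60))
group_mean <- c("Group A" = 32000, "Group B" = 25000, "Group C" = 20000)
salary <- rnorm(180, mean = group_mean[as.character(group)], sd = 4000)
d <- data.frame(salary, group)

# One-way ANOVA: H0 is that mean salary is equal across the three groups.
fit <- aov(salary ~ group, data = d)
tab <- summary(fit)[[1]]
f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

# Group means, for reading alongside the F-test.
print(tapply(d$salary, d$group, mean))
cat("F =", round(f_value, 2), "p-value =", signif(p_value, 3), "\n")
```

A small p-value says only that at least one group mean differs; it does not say which one, so a follow-up comparison of the group means is still needed.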

## 
##  studentized Breusch-Pagan test
## 
## data:  Salary ~ Education + Sex + Industry + NumEmployee + Occupation
## BP = 37.53, df = 19, p-value = 0.006807
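
With BP = 37.53 on 19 degrees of freedom and a p-value of 0.0068, the test rejects the null hypothesis of constant error variance at the 5% level, so the conventional standard errors in the regression table should be read with caution. To show what the studentized statistic measures, the sketch below computes it by hand in base R as n times the R-squared of an auxiliary regression of the squared residuals on the regressors. The data are simulated and every variable name is hypothetical; this is not the code that produced the output above.

```r
# Simulated data only: variable names and numbers are hypothetical.
set.seed(42)
n <- 500
sex <- factor(sample(c("Male", "Female"), n, replace = TRUE),
              levels = c("Male", "Female"))
big_firm <- factor(sample(c("No", "Yes"), n, replace = TRUE))

# Error spread is deliberately larger in big firms, so the variance is not constant.
noise_sd <- ifelse(big_firm == "Yes", 12000, 4000)
salary <- 30000 - 7000 * (sex == "Female") + 6000 * (big_firm == "Yes") +
  rnorm(n, mean = 0, sd = noise_sd)
d <- data.frame(salary, sex, big_firm)

fit <- lm(salary ~ sex + big_firm, data = d)

# Auxiliary regression: squared residuals on the same regressors.
d$resid_sq <- residuals(fit)^2
aux <- lm(resid_sq ~ sex + big_firm, data = d)

# Studentized (Koenker) statistic: n * R^2 of the auxiliary regression,
# compared with a chi-squared distribution with one df per regressor.
bp_stat <- n * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

cat("BP =", round(bp_stat, 2), "df =", bp_df, "p-value =", signif(bp_p, 3), "\n")
```

The degrees of freedom equal the number of slope coefficients in the auxiliary regression, which is why the test on the full model above has 19.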

5. Discussion

5.1 What results did you find

5.2 Why is it interesting

5.3 What would be the next steps

6. References

Notes:

parallel plots; message table for results; anova analysis; t-test tables; mean table for each boxplot; whether to include age in the linear regression